implement NxN conv (N>1) #25

Open
capitaso wants to merge 2 commits into mit-han-lab:master from capitaso:impl_NxN_conv
@capitaso capitaso commented Nov 25, 2020

Hello, I implemented the NxN (N>1) convolution case in AMC. You can run a test with the VGG16 model as follows.
bash ./scripts/search_vgg16_0.5flops.sh

@LiYunJamesPhD

@capitaso Thank you for implementing NxN conv. However, your implementation has a serious error: "least_square_sklearn" cannot run because it receives 4-dimensional inputs.

Thanks,

@capitaso

Author

@li-yun Thanks for reporting the error. Can you share the error message? And which model did you try to prune?

The 4-dimensional inputs are reshaped into a 2-dimensional matrix before least_square_sklearn is called, so that should not happen, but I may have done something wrong there.

@LiYunJamesPhD

@capitaso Sure. I tried to prune a pre-trained VGG16. The error message is below.

Traceback (most recent call last):
  File "amc_search.py", line 233, in <module>
    train(args.train_episode, agent, env, args.output)
  File "amc_search.py", line 132, in train
    observation2, reward, done, info = env.step(action)
  File "/home/liyun/model_compression/amc/env/channel_pruning_env.py", line 99, in step
    action, d_prime, preserve_idx = self.prune_kernel(self.prunable_idx[self.cur_ind], action, preserve_idx)
  File "/home/liyun/model_compression/amc/env/channel_pruning_env.py", line 284, in prune_kernel
    rec_weight = least_square_sklearn(X=masked_X, Y=Y)
  File "/home/liyun/model_compression/amc/lib/utils.py", line 130, in least_square_sklearn
    reg.fit(X, Y)
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/linear_model/_base.py", line 505, in fit
    X, y = self._validate_data(X, y, accept_sparse=['csr', 'csc', 'coo'],
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/base.py", line 432, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 795, in check_X_y
    X = check_array(X, accept_sparse=accept_sparse,
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/liyun/anaconda3/lib/python3.8/site-packages/sklearn/utils/validation.py", line 640, in check_array
    raise ValueError("Found array with dim %d. %s expected <= 2."
ValueError: Found array with dim 4. Estimator expected <= 2.
liyun@ferrari:~/model_compression/amc$

Thank you for replying to my message.

@capitaso

Author

@li-yun Thanks for the additional info. I ran a quick check, and at least the following command did not cause an error for me.

python amc_search.py --job=train --model=vgg16 --ckpt_path=checkpoints/vgg16.pth --dataset=imagenet --data_root=../../datasets/ILSVRC2012 --preserve_ratio=0.5 --lbound=0.2 --rbound=1 --reward=acc_reward --n_calibration_batches 15 --seed 2018

Can you tell me in which phase the error occurs: "strategy search" (--job=train) or "export" (--job=export)?

@capitaso

Author

@li-yun Although I am not sure this is the cause of your error, I found a bug related to the first FC layer that occurs in the export phase. I have only used AMC to prune the convolution layers, and I did not check the FC layers carefully. If you want to prune only convolution layers, the following change (in "env/channel_pruning_env.py" at line 24) may solve the problem. The bug will be fixed in the next few weeks.

-        self.prunable_layer_types = [torch.nn.modules.conv.Conv2d, torch.nn.modules.linear.Linear]
+        self.prunable_layer_types = [torch.nn.modules.conv.Conv2d]

@LiYunJamesPhD

LiYunJamesPhD commented Feb 15, 2021

@capitaso The error was in the search phase. I also plan to prune the convolution layers.

I believe the error is in the line "rec_weight = least_square_sklearn(X=masked_X, Y=Y)" in "env/channel_pruning_env.py". In my case masked_X and Y are 4-dimensional, with shapes [3000, 16, 32, 32] and [3000, 64, 32, 32]. Because they have more than 2 dimensions, the linear regression function in sklearn rejects them. Do you have any thoughts?
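
For illustration, here is a minimal sketch of one way to arrange convolution inputs as a 2-D matrix so that a linear least-squares fit accepts them. It is not the code from this PR: it uses NumPy's np.linalg.lstsq as a stand-in for least_square_sklearn, tiny made-up shapes, and a hypothetical helper to_patches that unfolds 3x3 neighbourhoods with zero padding.

```python
import numpy as np


def to_patches(x, k=3):
    """Unfold a (N, C, H, W) array into (N*H*W, C*k*k) rows of k x k patches.

    Hypothetical helper for illustration only; assumes stride 1 and zero
    padding of k // 2, so there is one row per output position.
    """
    n, c, h, w = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)))
    # Stack the k*k shifted views; result has shape (N, C, k*k, H, W).
    cols = np.stack(
        [xp[:, :, dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)],
        axis=2,
    )
    # Rows are (n, i, j) positions; columns are ordered (channel, ky, kx).
    return cols.transpose(0, 3, 4, 1, 2).reshape(n * h * w, c * k * k)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 6, 6))        # (N, C_in, H, W), made-up sizes
w_true = rng.standard_normal((5, 4, 3, 3))   # (C_out, C_in, k, k)

X2d = to_patches(x)                          # 2-D design matrix
Y2d = X2d @ w_true.reshape(5, -1).T          # synthetic conv outputs, 2-D

# With 2-D inputs the least-squares problem is well posed.
w_fit, *_ = np.linalg.lstsq(X2d, Y2d, rcond=None)
w_rec = w_fit.T.reshape(5, 4, 3, 3)          # back to conv-weight layout
```

The point of the sketch is only the shapes: each row is one sampled output position, and the channel and kernel axes are flattened into the columns before the regression is called.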

Another thing is that I am planning to prune a VGG16 model trained on CIFAR10 rather than ImageNet, so I doubt the command you provided will work for me.

Thanks

@capitaso

capitaso commented Feb 16, 2021

Author

@li-yun And what are the kernel width and height in that layer? If it's 1 x 1, there might be a problem at line 229. I will fix it, but please clarify the kernel width/height.

@LiYunJamesPhD

I used a 3 by 3 kernel in that layer. Yeah. I agree with that.

@capitaso

Author

@li-yun Then, the problem is something else... Can you share the whole network architecture? Pasting the output of "print(model)" will help.

@LiYunJamesPhD

@capitaso Sorry for the late reply. Sure. The following is the network architecture.

vgg(
  (feature): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (29): ReLU(inplace=True)
  )
  (classifier): Linear(in_features=512, out_features=10, bias=True)
)

Thanks

@capitaso

Author

@li-yun Sorry for the late reply. I made a small fix and committed it. Can you try the new version? I think it should work now. If it still does not work, I will probably need your source code to debug further.

@LiYunJamesPhD

@capitaso Thank you so much!! I will try the new one.

@LiYunJamesPhD

@capitaso I tried the new one, but I got a different error, which is IndexError: boolean index did not match indexed array along dimension 1; dimension is 65536 but corresponding boolean dimension is 64.

I guess the problem is in these lines.

231 k_size = int(X.shape[1] / weight.shape[1])
232 XX = X.reshape((X.shape[0],-1,k_size))
233 masked_X = XX[:, mask, :]
234 masked_X = masked_X.reshape((masked_X.shape[0],-1))

The shapes of X and weight are [3000, 64, 32, 32] and [64, 64, 3, 3], respectively.
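
As a rough illustration of how those four lines behave, the sketch below runs them on two inputs with tiny made-up sizes: a 2-D X of shape (N, C_in * k * k), and a 4-D X of shape (N, C_in, H, W). The wrapper name mask_channels and the channel-major column ordering of the 2-D case are assumptions for this example. With the shapes reported above, k_size evaluates to 64 / 64 = 1 and the middle axis becomes 64 * 32 * 32 = 65536, which matches the numbers in the error message.

```python
import numpy as np

n, c_in, k = 8, 4, 3
weight = np.zeros((6, c_in, k, k))             # (C_out, C_in, k, k)
mask = np.array([True, False, True, True])     # keep 3 of the 4 input channels


def mask_channels(X, weight, mask):
    """The same four steps as the quoted lines 231-234 (wrapper name is hypothetical)."""
    k_size = int(X.shape[1] / weight.shape[1])
    XX = X.reshape((X.shape[0], -1, k_size))
    masked_X = XX[:, mask, :]
    return masked_X.reshape((masked_X.shape[0], -1))


# Case 1: X is already 2-D, (N, C_in * k * k). Then k_size == k * k and the
# middle axis of XX has length C_in, so the C_in-long mask fits.
X_2d = np.zeros((n, c_in * k * k))
masked_shape = mask_channels(X_2d, weight, mask).shape

# Case 2: X is still 4-D, (N, C_in, H, W). Then k_size == 1 and the middle
# axis of XX has length C_in * H * W, so the mask no longer matches.
X_4d = np.zeros((n, c_in, 5, 5))
try:
    mask_channels(X_4d, weight, mask)
    error = None
except IndexError as exc:
    error = str(exc)
```

In the first case the result keeps 3 channels of 9 kernel positions each; in the second case NumPy raises an IndexError about the boolean index length.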

@LiYunJamesPhD

@capitaso Please disregard the previous message. The code is working.

@Beeeam

Beeeam commented Mar 10, 2023

Thank you for your work on this. I tried the code and it works. But have you fixed the accuracy problem with the VGG model? I have the same problem you mentioned.

@capitaso

Author

@Beeeam Do you mean this problem? No, I could not fix it. After struggling with it for a while, I gave up.

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso Thanks for your reply. I am also curious about the parameter 'n_points_eachlayer'. I increased it from 10 to 20, but got worse results.

@capitaso

Author

@Beeeam I think I tried that too, but changing the hyper-parameters did not help at all, and I have no idea what the remaining problem is...

But my implementation is probably OK: I fixed the pruning rate (without using AMC) and confirmed that it worked well.

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso The hyper-parameter I found really important is warmup...

Also, do you think using filter pruning would help?

@capitaso

Author

@Beeeam What exactly do you mean by filter pruning? Something like Fig. 1 (a) of this? If so, I have not tried it, but to my understanding filter pruning is equivalent to channel pruning in the previous layer.
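
The relationship between the two views can be sketched with a toy example. It assumes two bias-free layers with a ReLU in between, treated at a single spatial position so that each layer is a plain matrix product; the sizes and the keep mask are made up. Removing a filter (an output channel) from one layer gives the same output as masking the matching input channel of the layer that follows.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)             # input with 3 channels
W1 = rng.standard_normal((4, 3))       # first layer: 4 filters (output channels)
W2 = rng.standard_normal((2, 4))       # second layer: 4 input channels

keep = np.array([True, False, True, True])   # drop filter 1 of the first layer


def relu(v):
    return np.maximum(v, 0.0)


# View 1, filter pruning: delete rows of W1, then delete the columns of W2
# that would have consumed the removed outputs.
y_filter = W2[:, keep] @ relu(W1[keep] @ x)

# View 2, channel pruning: leave W1 alone and zero the masked input channels
# of the second layer.
y_channel = (W2 * keep) @ relu(W1 @ x)
```

Both views produce the same output here; with biases or normalization layers in between, the bookkeeping would need more care.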

@Beeeam

Beeeam commented Mar 13, 2023

@capitaso I mean something like filter_pruned_num = int(weight_torch.size()[0] * (1 - compress_rate)), which selects weights along dimension 0 (the output filters). It seems finer-grained than channel pruning.

@capitaso

Author

@Beeeam Sorry for my late reply. I did not find such code in env/channel_pruning_env.py. Can you please specify the lines you are referring to?
